Topic-Driven Multi-Document Summarization with Encyclopedic Knowledge and Spreading Activation
Abstract
Information of interest to users is often distributed over a set of documents. Users can specify their request for information as a query or topic: a set of one or more sentences or questions. Producing a good summary of the relevant information relies on understanding the query and linking it with the associated set of documents. To “understand” the query, we expand it using encyclopedic knowledge from Wikipedia. The expanded query is linked with its associated documents through spreading activation in a graph that represents the words in these documents and their grammatical connections. The topic-expanded words and the activated nodes in the graph are used to produce an extractive summary. The proposed method is tested on the DUC summarization data. The implemented system ranks high compared to the systems that participated in the DUC competitions, confirming our hypothesis that encyclopedic knowledge is a useful addition to a summarization system.
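The pipeline described in the abstract (expanded query, spreading activation over a word graph, extractive sentence selection) can be illustrated with a minimal Python sketch. This is not the authors' implementation. Several things here are assumptions made only for illustration: the expanded query is a hard-coded word list standing in for Wikipedia-based expansion, adjacent-word links stand in for grammatical connections, and the function names and the `decay`, `threshold`, and `max_steps` parameters are hypothetical.

```python
from collections import defaultdict


def build_graph(sentences):
    """Build an undirected word graph. Adjacent words are linked as a
    simple stand-in (assumption) for grammatical dependencies."""
    graph = defaultdict(set)
    for sent in sentences:
        words = sent.lower().split()
        for w in words:
            graph[w]  # touching the key registers the word as a node
        for a, b in zip(words, words[1:]):
            if a != b:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def spread_activation(graph, seeds, decay=0.5, threshold=0.01, max_steps=10):
    """Propagate activation outward from the seed nodes.

    Each active node passes value * decay, split evenly among its
    neighbours. A node keeps the strongest signal that reaches it, and
    signals below `threshold` are dropped. All three parameters are
    illustrative assumptions.
    """
    activation = defaultdict(float)
    frontier = {n: a for n, a in seeds.items() if n in graph}
    for node, value in frontier.items():
        activation[node] = value
    for _ in range(max_steps):
        next_frontier = {}
        for node, value in frontier.items():
            neighbours = graph[node]
            if not neighbours:
                continue
            passed = value * decay / len(neighbours)
            if passed < threshold:
                continue
            for nb in neighbours:
                if passed > activation[nb]:
                    activation[nb] = passed
                    next_frontier[nb] = passed
        if not next_frontier:
            break
        frontier = next_frontier
    return dict(activation)


def rank_sentences(sentences, expanded_query, top_k=1):
    """Return the top_k sentences by mean activation of their words."""
    graph = build_graph(sentences)
    seeds = {w.lower(): 1.0 for w in expanded_query}
    activation = spread_activation(graph, seeds)

    def score(sent):
        words = sent.lower().split()
        return sum(activation.get(w, 0.0) for w in words) / max(len(words), 1)

    return sorted(sentences, key=score, reverse=True)[:top_k]


# Toy documents and a hypothetical expanded query (not from the paper).
docs = [
    "the volcano erupted near the village",
    "officials evacuated the village residents",
    "the bakery sells fresh bread daily",
]
expanded_query = ["volcano", "eruption", "lava"]
summary = rank_sentences(docs, expanded_query, top_k=2)
print(summary)
```

In this sketch, query words that never occur in the documents (here "eruption" and "lava") simply contribute no seeds. Keeping only the strongest signal per node is one simple way to guarantee termination; summing incoming activation is an equally plausible design.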
Similar resources
Generating Update Summaries with Spreading Activation
For the update summaries task of the Text Analysis Conference 2008 we have implemented a novel summarization technique based on query expansion with encyclopedic knowledge and activation spreading in a large document graph. We have also experimented with sentence compression for building the summaries. The results are average – ranked 27 out of 58 for responsiveness in manual evaluation – but w...
Multi-Document Summarization by Graph Search and Matching
We describe a new method for summarizing similarities and differences in a pair of related documents using a graph representation for text. Concepts denoted by words, phrases, and proper names in the document are represented positionally as nodes in the graph along with edges corresponding to semantic relations between items. Given a perspective in terms of which the pair of documents is to be ...
Multi-Topic Multi-Document Summarization
Summarization of multiple documents featuring multiple topics is discussed. The example treated here consists of fifty articles about the Peru hostage incident from December 1996 through April 1997. They include a lot of topics such as opening, negotiation, ending, and so on. The method proposed in this paper is based on spreading activation over documents syntactically and semantically annot...
Sentence Extraction by Spreading Activation with Refined Similarity Measure
Although there has been a great deal of research on automatic summarization, most methods are based on a statistical approach, disregarding relationships between extracted textual segments. To ensure sentence connectivity, we propose a novel method to extract a set of comprehensible sentences that centers on several key points. This method generates a similarity network from documents with a le...
Publication date: 2008